%matplotlib notebook
from numpy import *
from matplotlib.pyplot import *
from numpy.linalg import qr
from scipy.linalg import hilbert
from numpy.linalg import eig
def display_mat(msg,A):
    print(msg)
    display(A)
    print("")
The least squares problem
$$\min_{x \in \mathbb{R}^n} \|b - Ax\|_2,$$
where $A \in \mathbb{R}^{m \times n}$ with $m \ge n$, has four associated "conditioning" problems, described in the table in Theorem 18.1 of TB (page 131). These are

- Sensitivity of $y = Ax = Pb$ to the right-hand-side vector $b$,
- Sensitivity of the solution $x$ to the right-hand-side vector $b$,
- Sensitivity of $y$ to the coefficient matrix $A$, and
- Sensitivity of the solution $x$ to the coefficient matrix $A$.

Sensitivity of $y$ to a perturbation in $b$.
In TB Lecture 12, the relative condition number is defined as
$$\kappa = \sup_{\delta b} \left( \frac{\|\delta f\|}{\|f\|} \Big/ \frac{\|\delta b\|}{\|b\|} \right).$$
Arguing directly from this definition, establish the condition number of $y$ with respect to perturbations in $b$ given by TB Lecture 18,
$$\kappa_{b \to y} = \frac{1}{\cos\theta}.$$
Hint: The input "$b$" in this problem is the right-hand-side vector and the output (or model) "$f(b)$" is $y = Pb$. Show geometrically that the supremum is attained with $\delta b \in \operatorname{range}(A)$.
Since the relative condition number is defined as
$$\kappa = \sup_{\delta b} \frac{\|\delta y\| / \|y\|}{\|\delta b\| / \|b\|},$$
the relative condition number of $y$ with respect to perturbations in $b$ is given by
$$\kappa_{b \to y} = \frac{\|b\|}{\|y\|} \sup_{\delta b} \frac{\|\delta y\|}{\|\delta b\|},$$
where $y = Pb$ is the projection of $b$ onto $\operatorname{range}(A)$, and $\delta b$ is a perturbation in $b$ with a corresponding perturbation $\delta y = P\,\delta b$ in $y$ (the projection of $\delta b$ onto $\operatorname{range}(A)$). All this is illustrated geometrically in the image below.

From the above image we notice that $\|y\| = \|b\| \cos\theta$, where $\theta$ is the angle between $b$ and $\operatorname{range}(A)$.

The relative condition number of $y$ with respect to perturbations in $b$ then becomes
$$\kappa_{b \to y} = \frac{1}{\cos\theta} \sup_{\delta b} \frac{\|\delta y\|}{\|\delta b\|}.$$
To find the supremum, we must maximize the ratio $\|\delta y\| / \|\delta b\|$. From the diagram above we see that whenever $\delta b \notin \operatorname{range}(A)$, $\|\delta b\|$ is strictly greater than $\|\delta y\|$, since projecting never increases length; to maximize the ratio, the angle between $\delta b$ and $\operatorname{range}(A)$ must be zero, so that $\delta y = \delta b$.

This means that we obtain $\sup_{\delta b} \|\delta y\| / \|\delta b\| = 1$ when $\delta b \in \operatorname{range}(A)$, and therefore the condition number of $y$ with respect to perturbations in $b$ becomes
$$\kappa_{b \to y} = \frac{1}{\cos\theta}.$$
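This bound can be checked numerically. The sketch below is my own illustration (the matrix `A` and vector `b` are arbitrary choices, not from the assignment): random perturbations never exceed the ratio $1/\cos\theta$, and a perturbation lying in $\operatorname{range}(A)$ attains it exactly.

```python
import numpy as np

# Illustrative least squares setup (hypothetical A and b).
A = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
b = np.array([1.0, 1.0, 1.0])

P = A @ np.linalg.pinv(A)                 # orthogonal projector onto range(A)
y = P @ b
cos_theta = np.linalg.norm(y) / np.linalg.norm(b)

rng = np.random.default_rng(0)
ratios = []
for _ in range(1000):
    db = rng.standard_normal(3)
    dy = P @ db
    # relative change in y divided by relative change in b
    ratios.append((np.linalg.norm(dy) / np.linalg.norm(y)) /
                  (np.linalg.norm(db) / np.linalg.norm(b)))

# No random perturbation exceeds the bound 1/cos(theta) ...
assert max(ratios) <= 1.0 / cos_theta + 1e-9

# ... and a perturbation inside range(A) attains it exactly,
# because then P @ db == db.
db = A @ np.array([0.3, -0.7])
dy = P @ db
ratio = (np.linalg.norm(dy) / np.linalg.norm(y)) / \
        (np.linalg.norm(db) / np.linalg.norm(b))
print(ratio, 1.0 / cos_theta)
```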
For $\theta = \pi/2$, the condition number is $\kappa_{b \to y} = \infty$. Illustrate what this means by considering the given least squares problem $Ax \simeq b$.

Use the results in TB 11.11 and 11.12 (page 82) to determine the projection operator $P$ for this problem. Then compute $y = Pb$ and show that $y = 0$. Find a perturbation $\delta b$ so that $\delta y \neq 0$. Explain what a condition number $\kappa = \infty$ might mean here. Illustrate your argument graphically.
From the above system we have the matrix $A$ and the right-hand-side vector $b$.

To find the projection operator for this problem, we use $P = A(A^*A)^{-1}A^*$, the orthogonal projector onto $\operatorname{range}(A)$ from TB 11.11 and 11.12.

To find the least squares solution, we write $Ax = Pb$ componentwise and equate elements of the two vectors; substituting back shows that one component of $x$ can take on any real value, i.e. the solution is not unique. Choosing a particular value gives one representative solution $x$ and the corresponding $y = Ax$.
This can be illustrated geometrically as shown in the image below.

From the graph above, we see that the projection of $b$ onto $\operatorname{range}(A)$ is the zero vector, that is $y = Pb = 0$. However, even a small change $\delta b$ in $b$ produces a nonzero perturbation $\delta y$ in the projection. This means that the residual is $r = b - y = b$, and the relative change in $y$ (and hence in the solution) is arbitrarily large.

With reference to the above ideas, we conclude that when $y = 0$, so that $\theta = \pi/2$, the condition number blows up, i.e. $\kappa_{b \to y} = \infty$.

The same conclusion can be reached directly from the formula for the condition number of $y$ with respect to perturbations in $b$:
$$\kappa_{b \to y} = \frac{1}{\cos\theta} = \frac{1}{\cos(\pi/2)} = \infty.$$
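A minimal sketch of this situation, using a hypothetical `A` and `b` of my own choosing (not the system from the assignment) with $b$ orthogonal to $\operatorname{range}(A)$:

```python
import numpy as np

# Hypothetical example: b is orthogonal to range(A),
# so its projection onto range(A) is the zero vector.
A = np.array([[1.0], [0.0]])
b = np.array([0.0, 1.0])

P = A @ np.linalg.inv(A.T @ A) @ A.T   # projector P = A (A^T A)^{-1} A^T
y = P @ b
print(y)                               # zero vector: cos(theta) = 0

# An arbitrarily small perturbation of b already produces a
# nonzero delta-y, so the *relative* change in y is infinite.
db = np.array([1e-8, 0.0])
dy = P @ db
print(np.linalg.norm(dy))              # nonzero
```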
Now consider the problem $Ax \simeq b$ in which $b$ lies in $\operatorname{range}(A)$.

For this problem, show that $\kappa_{b \to y} = 1$. What is qualitatively different about this problem than the problem in which $y = Pb = 0$?
Since in 1(a) we proved that $\kappa_{b \to y} = 1/\cos\theta$, and here the angle between $b$ and $\operatorname{range}(A)$ is zero, the condition number for this case is
$$\kappa_{b \to y} = \frac{1}{\cos 0} = 1.$$
For this problem, we see that the projection of $b$ onto the column space of $A$ is $b$ itself, that is $y = Pb = b$. This means that $b$ is in $\operatorname{range}(A)$, the residual is $r = b - y = 0$, and a small perturbation in $b$ produces only a correspondingly small change in the solution, which isn't the case when $y = 0$.
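A quick numerical check, with a hypothetical `A` of my own choosing and `b` built from the columns of $A$ so that $b \in \operatorname{range}(A)$ by construction:

```python
import numpy as np

A = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
b = A @ np.array([1.0, 1.0])          # b is in range(A) by construction

P = A @ np.linalg.pinv(A)             # orthogonal projector onto range(A)
y = P @ b                             # projection leaves b unchanged

# clip guards against ||y||/||b|| exceeding 1 by rounding error
theta = np.arccos(min(1.0, np.linalg.norm(y) / np.linalg.norm(b)))
print(theta)                          # essentially 0, so kappa = 1/cos(theta) = 1
```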
A=array([[1,1],[1,1.0001],[1,1.0001]])
b=array([[2],[0.0001],[4.0001]])
A_inv = linalg.pinv(A)
#A_inv = linalg.inv(A.T@A)@A.T
P=A@A_inv
display_mat(" A = ",A)
display_mat("Pseudo inverse of matrix A = ",A_inv)
display_mat("P = ",P)
A =
array([[1. , 1. ],
[1. , 1.0001],
[1. , 1.0001]])
Pseudo inverse of matrix A =
array([[10000.99999998, -4999.99999999, -4999.99999999],
[-9999.99999998, 4999.99999999, 4999.99999999]])
P =
array([[1.00000000e+00, 9.09494702e-13, 9.09494702e-13],
[0.00000000e+00, 5.00000000e-01, 5.00000000e-01],
[0.00000000e+00, 5.00000000e-01, 5.00000000e-01]])
#x1 = linalg.lstsq(A,b)
#display_mat(" x = ",x1[0])
x = A_inv@b
y = P@b
display_mat(" x = ",x)
display_mat(" y = ",y)
x =
array([[1.],
[1.]])
y =
array([[2. ],
[2.0001],
[2.0001]])
k =linalg.cond(A)
K = linalg.norm(A)*linalg.norm(A_inv)
theta = arccos(linalg.norm(y)/linalg.norm(b))
eta = (linalg.norm(A)*linalg.norm(x))/linalg.norm(y)
display_mat(" K(A) = ",k)
display_mat(" theta = ",theta)
display_mat(" eta = ",eta)
K(A) =
42429.235416083044
theta =
0.684702873261185
eta =
1.000000000833278
##d
K_by = 1/cos(theta)
K_bx = k/(eta*cos(theta))
K_Ay = k/(cos(theta))
K_Ax = k+ (((k**2) *tan(theta))/eta)
display_mat(" K_by = ",K_by)
display_mat(" K_bx = ",K_bx)
display_mat(" K_Ay < or = ",K_Ay)
display_mat(" K_Ax < or = ",K_Ax)
K_by =
1.290977236078942
K_bx =
54775.1770207547
K_Ay < or =
54775.17706639765
K_Ax < or =
1469883252.449082
The first condition number, $\kappa_{b \to y} = 1/\cos\theta$, is attained when $\|\delta y\| = \|\delta b\|$, which occurs when $\delta b$ lies in $\operatorname{range}(A)$ so that $P\delta b = \delta b$; for the projector $P$ computed above, a $\delta b$ that is zero except in its first entry has this property.

I therefore used such a $\delta b$, and the computed ratio matched the bound $1/\cos\theta$.

The condition number $\kappa_{b \to x} = \dfrac{\kappa(A)}{\eta \cos\theta}$ is attained when $\delta b$ points in the direction to which the solution $x$ is most sensitive; here this occurs when $\delta b$ is zero except in a single entry.

For this I used such a $\delta b$, and the computed ratio matched the bound.

The condition number $\kappa_{A \to x}$ is attained when the perturbation $\delta A$ is proportional to an outer product involving the vector orthogonal to $\operatorname{range}(A)$ and $v_2$, the second column of the right singular matrix $V$.

For this I used such a $\delta A$, and obtained a ratio close to the bound.
Show that if $(\lambda, v)$ is an eigenvalue/eigenvector pair for the matrix $A$, then $\left(\dfrac{1}{\lambda - \mu},\, v\right)$ is an eigenvalue/eigenvector pair for the matrix $(A - \mu I)^{-1}$.

Why is this observation useful when using the power iteration to find an eigenvalue close to $\mu$?

If $\lambda$ is an eigenvalue of the matrix $A$ with a corresponding eigenvector $v$, then we have
$$Av = \lambda v.$$
Subtracting $\mu v$ from both sides of the equation above, we obtain
$$(A - \mu I)v = (\lambda - \mu)v.$$
Since $\mu$ is not an eigenvalue of $A$, the matrix $(A - \mu I)$ is invertible, and multiplying the above equation by $\dfrac{1}{\lambda - \mu}(A - \mu I)^{-1}$, we get
$$(A - \mu I)^{-1} v = \frac{1}{\lambda - \mu}\, v.$$
Therefore, from the above equation, we can conclude that $\left(\dfrac{1}{\lambda - \mu},\, v\right)$ is an eigenvalue/eigenvector pair for the matrix $(A - \mu I)^{-1}$.

Why is this observation useful when using the power iteration to find an eigenvalue close to $\mu$?

This observation is useful because if $\lambda$ is the eigenvalue of $A$ closest to $\mu$, then $\dfrac{1}{\lambda - \mu}$ is the dominant (largest magnitude) eigenvalue of $(A - \mu I)^{-1}$. Power iteration applied to $(A - \mu I)^{-1}$ therefore converges to it, from which we can easily recover $\lambda$.
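This use of the observation is known as inverse (shifted) power iteration. A short self-contained sketch, with an illustrative diagonal matrix and shift of my own choosing:

```python
import numpy as np

def inverse_iteration(A, mu, iters=50):
    """Power iteration on (A - mu*I)^{-1}: converges to the
    eigenvector whose eigenvalue is closest to the shift mu."""
    m = A.shape[0]
    v = np.ones(m) / np.sqrt(m)
    B = A - mu * np.eye(m)
    for _ in range(iters):
        w = np.linalg.solve(B, v)   # apply (A - mu I)^{-1} without forming it
        v = w / np.linalg.norm(w)
    # The Rayleigh quotient recovers lambda itself, not 1/(lambda - mu).
    return v @ A @ v, v

A = np.diag([1.0, 3.0, 10.0])
lam, v = inverse_iteration(A, mu=2.9)
print(lam)   # converges to 3.0, the eigenvalue closest to 2.9
```

Solving the linear system each step (rather than inverting `B` once) is the standard design choice: it is cheaper and more stable, especially when `B` is nearly singular because `mu` is close to an eigenvalue.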
Exercise 29.1 (Lecture 29, TB page 223). This is a five-part problem that asks you to code an eigenvalue solver for a real, symmetric matrix using the shifted $QR$ algorithm. Do your code in Python, using the Numpy `qr` routine where needed.

The basic steps are:

- Reduce your matrix to tridiagonal form. You may use the `hessenberg` code we wrote in class.
- Implement the unshifted $QR$ code (also done in class). Use the Numpy routine `qr`. Your iteration should stop when the off-diagonal elements are smaller (in absolute value) than $10^{-12}$.
- Find all eigenvalues of a matrix using the "deflation" idea described in Algorithm 28.2.
- Introduce the Wilkinson shift, described in Lecture 29.
- Your code should work for a real, symmetric matrix.
- Your code does not have to be efficient in the sense of optimizing the cost of matrix/vector multiplies and so on.

Apply your algorithm to the Hilbert matrix `scipy.linalg.hilbert`. The entries of the Hilbert matrix are given by
$$h_{ij} = \frac{1}{i + j - 1}, \qquad i, j = 1, \dots, n.$$
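As a quick sanity check that `scipy.linalg.hilbert` matches the formula above:

```python
import numpy as np
from scipy.linalg import hilbert

n = 4
H = hilbert(n)
# Build the same matrix directly from h_ij = 1/(i + j - 1), 1-based indices.
H_formula = np.array([[1.0 / (i + j - 1) for j in range(1, n + 1)]
                      for i in range(1, n + 1)])
assert np.allclose(H, H_formula)
print(H[0])   # first row: 1, 1/2, 1/3, 1/4
```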
def display_mat(msg,A):
    print(msg)
    fstr = {'float' : "{:>10.8f}".format}
    with printoptions(formatter=fstr):
        display(A)
    print("")
def hessenberg(A):
    m,n = A.shape
    assert m == n, "A must be square"
    H = A.copy()
    Qt = eye(m)
    for j in range(m-1):           # reduce every column, not just the first
        x = H[j+1:,j:j+1]
        I = eye(m-j-1)
        s = 1 if x[0] > 0 else -1  # sign function, with sign(0) = 1
        v = s*linalg.norm(x,2)*I[:,0:1] + x
        vn = linalg.norm(v,2)
        if vn != 0:                # skip the reflection if the column is already zero
            v = v/vn
            F = I - 2*(v@v.T)      # Householder reflector
            H[j+1:,j:] = F@H[j+1:,j:]
            H[0:,j+1:] = H[0:,j+1:]@F  # Apply F to the right side of H.
    return H
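To sanity-check a Householder reduction like the one above, one can compare against `scipy.linalg.hessenberg` as a reference: for a symmetric matrix the Hessenberg form should be tridiagonal, and the eigenvalues must be unchanged because the reduction is a similarity transform. A self-contained check using scipy's routine:

```python
import numpy as np
from scipy.linalg import hessenberg as scipy_hessenberg

rng = np.random.default_rng(1)
S = rng.standard_normal((5, 5))
A = S + S.T                      # symmetric test matrix

T = scipy_hessenberg(A)

# Symmetric input => Hessenberg form is tridiagonal (entries beyond the
# first super/sub-diagonal vanish up to rounding error).
assert np.allclose(np.triu(T, 2), 0) and np.allclose(np.tril(T, -2), 0)

# The similarity transform preserves the spectrum.
assert np.allclose(np.sort(np.linalg.eigvalsh(A)),
                   np.sort(np.linalg.eigvalsh(T)))
```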
def eigenvalue_QR_solver(A,kmax,method=''):
    H = hessenberg(A)
    Ak = H.copy()
    mu = 0                          # shift; remains 0 for the unshifted method
    e = zeros((kmax,1))
    lam = zeros((size(H,0),1))
    for k in range(kmax):
        m = size(Ak,0)
        Q,R = qr(Ak - mu*eye(m))    # shifted QR step
        Ak = R@Q + mu*eye(m)
        if method == 'unshifted':
            if abs(Ak[-1,-2]) < 1e-12:
                print("number of iterations required=",k+1)
                lam[0:m] = array([diag(Ak)]).T
                break
        elif method == 'Rayleigh shift':
            mu = Ak[-1,-1]          # Rayleigh quotient shift for the next step
            if m == 1:
                e[0] = abs(Ak[-1,-1])
                lam[0] = Ak[-1,-1]
                print("number of iterations required to find the eigenvalue below=",k+1)
                print("eigenvalue = {:12.4e} \n".format(Ak[-1,-1]))
                break
            else:
                e[1:k] = abs(Ak[-1,-2])
                if abs(Ak[-1,-2]) < 1e-12:
                    print("number of iterations required to find the eigenvalue below=",k+1)
                    print("eigenvalue = {:12.4e} \n".format(Ak[-1,-1]))
                    lam[1:m] = Ak[-1,-1]
                    Ak = Ak[0:m-1,0:m-1]    # deflate
        elif method == 'Wilkinson shift':
            if m == 1:
                print("number of iterations required to find the eigenvalue below=",k+1)
                print("eigenvalue = {:12.4e} \n".format(Ak[-1,-1]))
                lam[0] = Ak[0,0]
                break
            else:
                sigma = (Ak[-2,-2]-Ak[-1,-1])/2
                sn = sign(sigma) if sigma != 0 else -1
                mu = Ak[m-1,m-1] - sn*Ak[m-2,m-1]**2/(abs(sigma) + sqrt(sigma**2 + Ak[m-2,m-1]**2))
                if abs(Ak[-1,-2]) < 1e-12:
                    print("number of iterations required to find the eigenvalue below=",k+1)
                    print("eigenvalue = {:12.4e} \n".format(Ak[-1,-1]))
                    lam[1:m] = Ak[-1,-1]
                    Ak = Ak[0:m-1,0:m-1]    # deflate
    return e,sort(lam)[::-1]
H = hilbert(4)
# TRUE EIGENVALUES
eval_true1,evec_true = eig(H)
eval_true1 = sort(eval_true1.real).reshape(size(H,0),1)
eval_true1
array([[9.67023040e-05],
[6.73827361e-03],
[1.69141220e-01],
[1.50021428e+00]])
# USING UNSHIFTED QR
e1,lam1= eigenvalue_QR_solver(H,kmax= 20,method='unshifted')
#lam1 = sort(lam1)[::-1]
display_mat("Eigenvalues = ",lam1)
display_mat("Error = ",linalg.norm(eval_true1 - lam1))
number of iterations required= 6

Eigenvalues =
array([[0.00009670],
[0.00673827],
[0.16914122],
[1.50021428]])
Error =
4.282800079194783e-12
#USING RAYLEIGH SHIFT
e2,lam2= eigenvalue_QR_solver(H,kmax= 20,method='Rayleigh shift')
#lam2 = sort(lam2)[::-1]
display_mat("Eigenvalues = ",lam2)
display_mat("Error = ",linalg.norm(eval_true1 - lam2))
number of iterations required to find the eigenvalue below= 6
eigenvalue = 9.6702e-05

number of iterations required to find the eigenvalue below= 8
eigenvalue = 6.7383e-03

number of iterations required to find the eigenvalue below= 13
eigenvalue = 1.6914e-01

number of iterations required to find the eigenvalue below= 14
eigenvalue = 1.5002e+00

Eigenvalues =
array([[0.00009670],
[0.00673827],
[0.16914122],
[1.50021428]])
Error =
2.2303812898233666e-16
# USING WILKINSON SHIFT
e3,lam3= eigenvalue_QR_solver(H,kmax= 20,method='Wilkinson shift')
#lam3 = sort(lam3)[::-1]
display_mat("Eigenvalues = ",lam3)
display_mat("Error = ",linalg.norm(eval_true1 - lam3))
number of iterations required to find the eigenvalue below= 6
eigenvalue = 9.6702e-05

number of iterations required to find the eigenvalue below= 8
eigenvalue = 6.7383e-03

number of iterations required to find the eigenvalue below= 13
eigenvalue = 1.6914e-01

number of iterations required to find the eigenvalue below= 14
eigenvalue = 1.5002e+00

Eigenvalues =
array([[0.00009670],
[0.00673827],
[0.16914122],
[1.50021428]])
Error =
2.2303812898233666e-16
a = arange(15,0,-1)
A = diag(a) + ones((15,15))
# TRUE EIGENVALUES
eval_true,evec_true = eig(A)
eval_true = sort(eval_true.real).reshape(size(A,0),1)
eval_true
array([[ 1.21465537],
[ 2.25695098],
[ 3.28777559],
[ 4.31431185],
[ 5.33895988],
[ 6.36294449],
[ 7.38709275],
[ 8.41211207],
[ 9.43874576],
[10.46792166],
[11.50098302],
[12.54018637],
[13.59013196],
[14.664097 ],
[24.22313127]])
# USING UNSHIFTED QR
e11,lam11= eigenvalue_QR_solver(A,kmax=1000,method='unshifted')
#lam1 = sort(lam1)[::-1]
display_mat("Eigenvalues = ",lam11)
display_mat("Error = ",linalg.norm(eval_true - lam11))
number of iterations required= 42

Eigenvalues =
array([[1.21465537],
[2.25695098],
[3.28777559],
[4.31431185],
[5.33895991],
[6.36294484],
[7.38709463],
[8.41211890],
[9.43876514],
[10.46797027],
[11.50111645],
[12.54132790],
[13.58892879],
[14.66394811],
[24.22313127]])
Error =
0.0016713681173739802
#USING RAYLEIGH SHIFT
e22,lam22= eigenvalue_QR_solver(A,kmax=1000,method='Rayleigh shift')
#lam2 = sort(lam2)[::-1]
display_mat("Eigenvalues = ",lam22)
display_mat("Error = ",linalg.norm(eval_true - lam22))
number of iterations required to find the eigenvalue below= 42
eigenvalue = 1.2147e+00

number of iterations required to find the eigenvalue below= 70
eigenvalue = 2.2570e+00

number of iterations required to find the eigenvalue below= 97
eigenvalue = 3.2878e+00

number of iterations required to find the eigenvalue below= 124
eigenvalue = 4.3143e+00

number of iterations required to find the eigenvalue below= 151
eigenvalue = 5.3390e+00

number of iterations required to find the eigenvalue below= 178
eigenvalue = 6.3629e+00

number of iterations required to find the eigenvalue below= 205
eigenvalue = 7.3871e+00

number of iterations required to find the eigenvalue below= 232
eigenvalue = 8.4121e+00

number of iterations required to find the eigenvalue below= 259
eigenvalue = 9.4387e+00

number of iterations required to find the eigenvalue below= 286
eigenvalue = 1.0468e+01

number of iterations required to find the eigenvalue below= 313
eigenvalue = 1.1501e+01

number of iterations required to find the eigenvalue below= 345
eigenvalue = 1.2540e+01

number of iterations required to find the eigenvalue below= 348
eigenvalue = 1.3590e+01

number of iterations required to find the eigenvalue below= 349
eigenvalue = 1.4664e+01

number of iterations required to find the eigenvalue below= 350
eigenvalue = 2.4223e+01

Eigenvalues =
array([[1.21465537],
[2.25695098],
[3.28777559],
[4.31431185],
[5.33895988],
[6.36294449],
[7.38709275],
[8.41211207],
[9.43874576],
[10.46792166],
[11.50098302],
[12.54018637],
[13.59013196],
[14.66409700],
[24.22313127]])
Error =
7.965709004822976e-14
# USING WILKINSON SHIFT
e33,lam33= eigenvalue_QR_solver(A,kmax=1000,method='Wilkinson shift')
#lam3 = sort(lam3)[::-1]
display_mat("Eigenvalues = ",lam33)
display_mat("Error = ",linalg.norm(eval_true - lam33))
number of iterations required to find the eigenvalue below= 42
eigenvalue = 1.2147e+00

number of iterations required to find the eigenvalue below= 70
eigenvalue = 2.2570e+00

number of iterations required to find the eigenvalue below= 97
eigenvalue = 3.2878e+00

number of iterations required to find the eigenvalue below= 124
eigenvalue = 4.3143e+00

number of iterations required to find the eigenvalue below= 151
eigenvalue = 5.3390e+00

number of iterations required to find the eigenvalue below= 178
eigenvalue = 6.3629e+00

number of iterations required to find the eigenvalue below= 205
eigenvalue = 7.3871e+00

number of iterations required to find the eigenvalue below= 232
eigenvalue = 8.4121e+00

number of iterations required to find the eigenvalue below= 259
eigenvalue = 9.4387e+00

number of iterations required to find the eigenvalue below= 286
eigenvalue = 1.0468e+01

number of iterations required to find the eigenvalue below= 313
eigenvalue = 1.1501e+01

number of iterations required to find the eigenvalue below= 345
eigenvalue = 1.2540e+01

number of iterations required to find the eigenvalue below= 348
eigenvalue = 1.3590e+01

number of iterations required to find the eigenvalue below= 349
eigenvalue = 1.4664e+01

number of iterations required to find the eigenvalue below= 350
eigenvalue = 2.4223e+01

Eigenvalues =
array([[1.21465537],
[2.25695098],
[3.28777559],
[4.31431185],
[5.33895988],
[6.36294449],
[7.38709275],
[8.41211207],
[9.43874576],
[10.46792166],
[11.50098302],
[12.54018637],
[13.59013196],
[14.66409700],
[24.22313127]])
Error =
7.965709004822976e-14